Semantic Context Detection Using Audio Event Fusion
Authors
Abstract
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events in action movies: gunshot, explosion, engine, and car braking. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
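The first modeling level described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes audio features have already been quantized into discrete symbols, and the helper names (`hmm_log_likelihood`, `event_score_vector`, `toy_model`) and all model parameters are hypothetical placeholders. Each audio segment is scored against one HMM per event with the forward algorithm, and the resulting log-likelihoods form a vector that a context-level model could consume.

```python
import math


def safe_log(p):
    """Return log(p), mapping zero probability to -inf."""
    return math.log(p) if p > 0 else float("-inf")


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, start, trans, emit):
    """Forward algorithm in log space for a discrete-observation HMM.

    obs        : non-empty list of symbol indices
    start[i]   : P(first state = i)
    trans[i][j]: P(next state = j | current state = i)
    emit[i][k] : P(symbol k | state i)
    """
    n = len(start)
    alpha = [safe_log(start[i]) + safe_log(emit[i][obs[0]]) for i in range(n)]
    for symbol in obs[1:]:
        alpha = [
            log_sum_exp([alpha[i] + safe_log(trans[i][j]) for i in range(n)])
            + safe_log(emit[j][symbol])
            for j in range(n)
        ]
    return log_sum_exp(alpha)


def toy_model(p):
    """Hypothetical 2-state, 2-symbol HMM; p skews emissions so models differ."""
    start = [0.6, 0.4]
    trans = [[0.7, 0.3], [0.2, 0.8]]
    emit = [[p, 1 - p], [1 - p, p]]
    return start, trans, emit


# One placeholder model per audio event named in the abstract.
EVENT_MODELS = {
    "gunshot": toy_model(0.9),
    "explosion": toy_model(0.7),
    "engine": toy_model(0.4),
    "car_braking": toy_model(0.2),
}


def event_score_vector(obs, event_models):
    """Per-event log-likelihoods for one segment, ordered by event name."""
    return [
        hmm_log_likelihood(obs, *event_models[name])
        for name in sorted(event_models)
    ]


if __name__ == "__main__":
    segment = [0, 0, 1, 0, 1, 1]  # made-up quantized feature symbols
    for name, score in zip(sorted(EVENT_MODELS), event_score_vector(segment, EVENT_MODELS)):
        print(f"{name}: {score:.3f}")
```

In a two-level scheme of the kind the abstract outlines, the score vectors of consecutive segments would be passed to the context-level model (an ergodic HMM or an SVM in the paper) to decide whether a scene is gunplay or car chasing. That fusion step is left out here because the abstract does not specify its details.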
Similar resources
Semantic Context Detection Using Audio Event Fusion: Camera-Ready Version
Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this w...
Event Detection in Basketball Video Using Multiple Modalities
Semantic sports video analysis has attracted more and more attention recently. In this paper, we present a basketball event detection method that uses multiple modalities. Instead of using low-level features, the proposed method is built upon visual and auditory mid-level features, i.e., semantic shot classes and audio keywords. Promising event detection results have been achieved. By heuristically...
IRIT @ TRECVid 2010 : Hidden Markov Models for Context-aware Late Fusion of Multiple Audio Classifiers
This notebook paper describes the four runs submitted by IRIT at the TRECVid 2010 Semantic Indexing task. The four submitted runs can be described and compared as follows:
• Run 4 – late fusion (weighted sum) of multiple audio-only classifiers output
• Run 3 – context-aware re-rank of run 4 using hidden Markov model
• Run 2 – context-aware late fusion of multiple audio classifiers output with hidde...
Audio-concept features and hidden Markov models for multimedia event detection
Multimedia event detection (MED) on user-generated content is the task of finding an event, e.g., a Flash mob or Attempting a bike trick, using its content characteristics. Recent research has focused on approaches that use semantically defined “concepts” trained with annotated audio clips. Using audio concepts allows us to show semantic evidence of their relationship to events, by looking at t...
Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection
Common fusion techniques in audio-visual speech processing operate on the modality level, i.e., they either combine the features extracted from the two modalities directly or derive a decision for each modality separately and then combine the modalities on the decision level. We investigate the audio-visual processing of linguistic prosody, more precisely the extraction of word prominence. In th...
Journal title:
- EURASIP J. Adv. Sig. Proc.
Volume 2006, Issue
Pages -
Publication date: 2006